138 research outputs found

    Alloy Informatics through Ab Initio Charge Density Profiles: Case Study of Hydrogen Effects in Face-Centered Cubic Crystals

    Materials design has traditionally evolved through trial-and-error approaches, mainly due to the non-local relationship between microstructures and properties such as strength and toughness. We propose 'alloy informatics' as a machine-learning-based prototype predictive approach for alloys and compounds, using electron charge density profiles derived from first-principles calculations. We demonstrate this framework in the case of hydrogen interstitials in face-centered cubic crystals, showing that their differential electron charge density profiles capture crystal properties and defect-crystal interaction properties. Radial Distribution Functions (RDFs) of defect-induced differential charge density perturbations highlight the resulting screening effect and, together with hydrogen Bader charges, strongly correlate with a large set of atomic properties of the metal species forming the bulk crystal. We observe the spontaneous emergence of classes of charge responses while coarse-graining over crystal compositions. Nudged Elastic Band calculations show that RDFs and charge features also connect to hydrogen migration energy barriers between interstitial sites. Unsupervised machine learning on RDFs supports classification, unveiling compositional and configurational non-localities in the similarities of the perturbed densities. Electron charge density perturbations may be considered as bias-free descriptors for a large variety of defects.

    Generalized empirical Bayesian methods for discovery of differential data in high-throughput biology

    Motivation: High-throughput data are now commonplace in biological research. Rapidly changing technologies and applications mean that novel methods for detecting differential behaviour that account for a ‘large P, small n’ setting are required at an increasing rate. The development of such methods is, in general, being done on an ad hoc basis, which requires further development cycles and leads to a lack of standardization between analyses. Results: We present here a generalized method for identifying differential behaviour within high-throughput biological data through empirical Bayesian methods. This approach is based on our baySeq algorithm for identification of differential expression in RNA-seq data based on a negative binomial distribution, and in paired data based on a beta-binomial distribution. Here we show how the same empirical Bayesian approach can be applied to any parametric distribution, removing the need for lengthy development of novel methods for differently distributed data. Comparisons with existing methods developed to address specific problems in high-throughput biological data show that these generic methods can achieve equivalent or better performance. A number of enhancements to the basic algorithm are also presented to increase flexibility and reduce computational costs. Availability and implementation: The methods are implemented in the R baySeq (v2) package, available on Bioconductor http://www.bioconductor.org/packages/release/bioc/html/baySeq.html. Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online. This work was supported by European Research Council Advanced Investigator Grant ERC-2013-AdG 340642 – TRIBE. This is the author accepted manuscript. The final version is available from Oxford University Press via http://dx.doi.org/10.1093/bioinformatics/btv56

    Improving CC-NUMA performance using instruction-based prediction

    We propose Instruction-based Prediction as a means to optimize directory-based cache coherent NUMA shared-memory. Instruction-based prediction is based on observing the behavior of load and store instructions in relation to coherent events and predicting their future behavior. Although this technique is well established in the uniprocessor world, it has not been widely applied for optimizing transparent shared-memory. Typically, in this environment, prediction is based on datablock access history (address-based prediction) in the form of adaptive cache coherence protocols. The advantage of instruction-based prediction is that it requires few hardware resources in the form of small prediction structures per node to match (or exceed) the performance of address-based prediction. To show the potential of instruction-based prediction w
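As a rough illustration of the idea described in this abstract, the sketch below keeps a small table of 2-bit saturating counters indexed by the program counter of a load or store, and predicts whether that instruction will trigger a coherence event. The class name, table size, indexing function, and counter width are all assumptions made for illustration; the abstract does not specify the predictor's structure.

```python
class InstructionPredictor:
    """Hypothetical per-node predictor indexed by instruction address (PC)."""

    def __init__(self, entries: int = 256):
        self.entries = entries
        # One 2-bit saturating counter (0..3) per table slot; assumed design.
        self.counters = [0] * entries

    def _slot(self, pc: int) -> int:
        # Drop the low two bits (assumes 4-byte instructions), then wrap.
        return (pc >> 2) % self.entries

    def predict(self, pc: int) -> bool:
        # Predict "this instruction will cause a coherence event"
        # when the counter is in its upper half.
        return self.counters[self._slot(pc)] >= 2

    def update(self, pc: int, event_occurred: bool) -> None:
        # Train on the observed outcome, saturating at 0 and 3.
        i = self._slot(pc)
        if event_occurred:
            self.counters[i] = min(3, self.counters[i] + 1)
        else:
            self.counters[i] = max(0, self.counters[i] - 1)
```

The table is indexed by instruction rather than by data block, which is what allows it to stay small: its size depends on the number of distinct loads and stores in the program, not on the amount of shared data.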

    The Use of Instruction-Based Prediction in Hardware Shared-Memory


    Identification And Optimization Of Sharing Patterns For Scalable Shared-Memory Multiprocessors

    Distributed shared-memory architectures typically employ a directory-based protocol to maintain cache coherence. Identifying sharing patterns in parallel programs and applying specialized optimizations can increase cache-coherence protocol efficiency and yield performance improvements. In this thesis, I propose and study both optimizations to sharing patterns and techniques to identify sharing patterns. The main thrust of the thesis is GLOW, a comprehensive optimization for wide sharing, a sharing pattern that is a serious obstacle to scalability to large numbers of processors. I present GLOW in the form of extensions to the SCI ANSI/IEEE standard. GLOW is implemented in special network switches and incorporates characteristics that are not found together in previous proposals: scalable writes and scalable reads, network locality (by exploiting the abundance of widely-shared data to satisfy requests locally), simplicity, transparency to the base protocol, and network topology indepen..

    Ipstash: A power-efficient memory architecture for ip-lookup

    High-speed routers often use commodity, fully associative

    Ipstash: A set-associative memory approach for efficient ip-lookup

    IP-Lookup is a challenging problem because of the increasing routing table sizes, increased traffic, and higher speed links. These characteristics lead to the prevalence of hardware solutions such as TCAMs (Ternary Content Addressable Memories), despite their high power consumption, low update rate, and increased board area requirements. We propose a memory architecture called IPStash to act as a TCAM replacement, offering at the same time a high update rate, higher performance, and significant power savings. The premise of our work is that full associativity is not necessary for IP-lookup. Rather, we show that the required associativity is simply a function of the routing table size. Thus, we propose a memory architecture similar to set-associative caches but enhanced with mechanisms to facilitate IP-lookup and in particular longest prefix match (LPM). To reach a minimum level of required associativity we introduce an iterative method to perform LPM in a small number of iterations. This allows us to insert route prefixes of different lengths in IPStash very efficiently, selecting the most appropriate index in each case. Orthogonal to this, we use skewed associativity to increase the effective capacity of our devices. We thoroughly examine different choices in partitioning routing tables for the iterative LPM and the design space for the IPStash devices. The proposed architecture is also easily expandable. Using the Cacti 3.2 access time and power consumption simulation tool, we explore the design space for IPStash devices and compare them with the best blocked commercial TCAMs.
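The iterative set-associative lookup described in this abstract can be sketched in software as follows. This is a minimal illustration only, not the IPStash design: the class name, the three prefix-length classes, the 4-bit index, the 4-way sets, and the choice of index bits are all assumptions, and skewed associativity is omitted. Each prefix is placed in a set chosen from bits that every prefix in its length class is guaranteed to have, and a lookup probes one set per class, longest class first.

```python
import ipaddress


def ip(text: str) -> int:
    """Convert dotted-quad text to a 32-bit integer."""
    return int(ipaddress.IPv4Address(text))


class SetAssociativeLPM:
    """Toy set-associative table with iterative longest prefix match."""

    def __init__(self, index_bits=4, ways=4,
                 classes=((21, 32), (13, 20), (4, 12))):
        self.index_bits = index_bits
        self.ways = ways
        # Hypothetical prefix-length classes (min_len, max_len), longest first.
        self.classes = classes
        self.sets = [[] for _ in range(1 << index_bits)]

    def _index(self, addr: int, min_len: int) -> int:
        # Index with the last index_bits of the first min_len address bits,
        # which are defined for every prefix in the class.
        return (addr >> (32 - min_len)) & ((1 << self.index_bits) - 1)

    def insert(self, prefix: int, length: int, next_hop: str) -> None:
        for lo, hi in self.classes:
            if lo <= length <= hi:
                bucket = self.sets[self._index(prefix, lo)]
                if len(bucket) >= self.ways:
                    raise OverflowError("set is full")
                bucket.append((prefix, length, next_hop))
                return
        raise ValueError("no class covers this prefix length")

    def lookup(self, addr: int):
        # One probe per class; the first class with a hit holds the longest match.
        for lo, hi in self.classes:
            best = None
            for prefix, length, hop in self.sets[self._index(addr, lo)]:
                if not lo <= length <= hi:
                    continue  # entry belongs to another class sharing this set
                mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
                if addr & mask == prefix and (best is None or length > best[0]):
                    best = (length, hop)
            if best is not None:
                return best[1]
        return None
```

A short usage example with made-up routes:

```python
table = SetAssociativeLPM()
table.insert(ip("10.0.0.0"), 8, "A")
table.insert(ip("10.1.0.0"), 16, "B")
table.insert(ip("10.1.2.0"), 24, "C")
print(table.lookup(ip("10.1.2.3")))   # C
print(table.lookup(ip("10.1.9.9")))   # B
print(table.lookup(ip("10.9.9.9")))   # A
```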